Demand-driven dynamic aggregate

ABSTRACT

An aggregate is generated. Upon receiving a second query comprising a filter criterion, a determination is made as to whether at least a threshold number of previous first queries comprises a similar filter criterion, and if so generating an aggregate using the similar filter criterion as an aggregation criterion, such that future queries comprising the similar filter criterion are satisfied by the aggregate.

PRIOR FOREIGN APPLICATION

This application claims priority from United Kingdom patent application number 1404521.5, filed Mar. 14, 2014, which is hereby incorporated herein by reference in its entirety.

BACKGROUND

One or more aspects relate generally to generating an aggregate, and in particular to generating an aggregate in a data storage system.

Most of today's business and technical software applications require databases for storing and retrieving information. Due to the tremendous growth in the volume of data, databases inevitably grow over time. However, a larger database requires more retrieval time. Under the terms “big data” and “business analytics”, business information solutions analyze and aggregate huge amounts of data, usually stored in database tables, to extract relevant information.

To accelerate access to the information, data is often aggregated in advance and materialized in so-called aggregates. The physical representation of an aggregate might be, e.g., a database table, a materialized query table (MQT), also known as an (automatic) summary table ((A)ST), or a structure stored in memory or in a file system.

Designing aggregates is a challenging task since good aggregates are to support a reasonable number of queries. However, queries are often not known in advance. Furthermore, there is a trade-off between the number of queries an aggregate supports and its size, i.e., storage consumption, maintenance effort and access performance. In addition, data in aggregates is to be maintained within a refresh/rollup procedure when the underlying container/table is updated. In summary, optimizing data access using aggregates is complex and requires sophisticated technology.

A couple of technologies are available to address this technical field. Document US 2010/0318527 A1, which is hereby incorporated by reference herein in its entirety, discloses a method and a system for dynamically creating aggregates. An aggregate table manager is instantiated that receives a plurality of aggregate table definitions, and generates aggregate tables based on received aggregate table definitions.

In document U.S. Pat. No. 8,515,948 B2, which is hereby incorporated by reference herein in its entirety, techniques are provided for creating one or more fine-grained access control rules that are associated with a base table. A materialized query table is created from the base table without applying the one or more fine-grained access control rules associated with the base table when obtaining data from the base table.

SUMMARY

According to one aspect, a method for generating an aggregate may be provided. The method may comprise: upon receiving a second query comprising a filter criterion, determining if at least a threshold number of previous first queries may comprise a similar filter criterion, and if so, generating an aggregate using the similar filter criterion as an aggregation criterion, such that future queries comprising the similar filter criterion may be satisfied by the aggregate.

According to another aspect, an aggregate builder for generating an aggregate may be provided. The aggregate builder may comprise a threshold determination unit adapted for, upon receiving a second query comprising a filter criterion, determining if at least a threshold number of previous first queries comprises a similar filter criterion, wherein the threshold determination unit is further adapted for triggering an aggregate generation unit. The aggregate generation unit may be adapted for generating the aggregate using the similar filter criterion as an aggregation criterion, such that future queries comprising the similar filter criterion may be satisfied by the aggregate.

It may be noted that the second query may be executed after the first query, or a group of first queries, has been performed.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention will now be described, by way of example only, and with reference to the following drawings:

FIG. 1 shows one example of a flow chart of an embodiment of generating an aggregate;

FIG. 2 shows one example of a more detailed flowchart of generating an aggregate;

FIG. 3 shows one example of a structural chart of an embodiment of an aggregate builder; and

FIG. 4 shows one example of a structural chart of an embodiment of a computer system comprising the aggregate builder.

DETAILED DESCRIPTION

It may be noted that a filter criterion may be composed of a set of filter conditions relating to specific filter fields.

In the context of this description, the following conventions, terms and/or expressions may be used:

The term “aggregate” may denote a prefabricated result summary of a group of query statements, expressed e.g. in SQL (Structured Query Language) or MDX (Multidimensional Expressions), which may improve the performance of queries against a database or other data collections. In its simplest form, an aggregate may be a simple summary table that may be derived by performing an SQL “GROUP BY” query. A more common use of aggregates may be to take a dimension and change the granularity of this dimension in the database.

The term “second query” may, in comparison to a “first query” or previous query, denote a data retrieval request against a database which may happen later than a first query. In general, a query may denote a request for data from a database.

The term “filter criterion” may denote one or more conditions defined in a query statement.

The term “similar filter criterion” may denote a filter criterion that may be related to another filter criterion such that the filter criteria are not completely different. That may, e.g., be the case if the related filtering criteria comprise identical filtering fields, or if a first set of filtering fields of the first query is a subset of a second set of filtering fields of the second query. This would mean that the second query may be more specific and that the result set of the second query would also be a subset of the result set of the first query. Additionally, a number of different filtering values between the first query and the second query may be below a predefined threshold.

The term “filtering field”, in particular in a query, may denote an argument of an SQL statement for which a condition is defined for a query.

The term “materialized query table” or “materialized view” may denote a database object that may contain the results of a query. E.g., it may be a local copy of data located remotely, or may be a subset of the rows and/or columns of a table or join result, or may be a summary based on aggregations of a table's data. Materialized views, which store data based on remote tables, are also known as snapshots. Also a snapshot may be a materialized view. All of this may relate to the relational database model. A view may be seen as a virtual table representing the result of a database query. Whenever a query or an update addresses an ordinary view's virtual table, the database management system may convert these into queries or updates against the underlying base tables. A materialized view may take a different approach in which the query result may be cached as a concrete table that may be updated from the original base tables from time to time. This may enable much more efficient access, at the cost of some data being potentially out-of-date. It may be particularly useful in data warehousing scenarios, where frequent queries of the actual base tables can be extremely expensive in terms of computational requirements.

In accordance with one or more aspects, aggregates do not need to be designed upfront. No programmer may be required to define and create aggregates. A pre-definition of aggregates may be obsolete. Further, there may be no need to maintain aggregates and aggregate definitions. The design and structure of the aggregates may be created dynamically, i.e., on the fly. An update of underlying base data or underlying data containers or tables may automatically invalidate a current aggregate. The creation of new aggregates may be governed by the ongoing query workload. Thus, the aggregates may be adapted dynamically, automatically and permanently. Instead of inflexible large aggregates according to state of the art technology, small, dense, demand-driven aggregates may be created, increasing the performance of a data storage system like a relational database. This way, existing resources of a computer system underlying the data storage system may be used more efficiently by these demand-driven dynamic aggregates.

According to one embodiment, a first query and a second query may be determined to comprise a similar filter criterion if the related filtering criteria comprise identical filtering fields, or if a first set of filtering fields of the first query is a subset of a second set of filtering fields of the second query, and wherein a number of different filtering values between the first query and the second query is below a predefined threshold. These conditions may be definable in software if the method is a computer implemented method.
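The similarity test of this embodiment may be sketched in Python as follows. The sketch assumes that a filter criterion is represented as a mapping from filtering field to filtering value, and that the value threshold applies both to the identical-field case and to the subset case. The names is_similar and max_diff are illustrative and do not appear in the description.

```python
from typing import Dict

# Assumed representation: filtering field -> filtering value.
Filter = Dict[str, object]


def is_similar(first: Filter, second: Filter, max_diff: int) -> bool:
    """Return True if `first` and `second` carry a similar filter criterion.

    Field condition: the filtering fields of `first` are identical to, or a
    subset of, the filtering fields of `second`.
    Value condition: the number of shared fields whose values differ is
    below `max_diff` (the predefined threshold).
    """
    if not set(first) <= set(second):
        return False
    differing = sum(1 for field in first if first[field] != second[field])
    return differing < max_diff


first_query = {"CALMONTH": "06.2013", "STORE_ID": 5}
second_query = {"CALMONTH": "06.2013", "STORE_ID": 4}
print(is_similar(first_query, second_query, max_diff=2))
```

Here the two queries use the same filtering fields and differ in one value (STORE_ID), so they count as similar for any threshold above one.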

In a further embodiment, the technique may comprise determining cardinalities of values relating to at least one of the filtering fields comprised in the filter criterion having generated the aggregate, and determining if the cardinalities may exceed a threshold count. If that is the case, the filter criterion may be narrowed such that a smaller aggregate may be built. Cardinalities may be understood as a number of occurrences of values corresponding to a filtering field (e.g., records) in a result set of a query. A narrowing of the filter criterion may, e.g., be achieved by dividing the value range of a related filter field into an upper portion and a lower portion, i.e., cutting the value range into two halves. However, any other technique for lowering the number of results in the result set of a query may be adequate.
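A minimal sketch of the halving step, assuming an integer-valued filtering field and treating the cardinality as the number of values observed in the result set. The helper names split_range and narrowed_ranges are hypothetical.

```python
from typing import List, Sequence, Tuple

Range = Tuple[int, int]  # inclusive (low, high) bounds of a filtering field


def split_range(low: int, high: int) -> Tuple[Range, Range]:
    """Cut the inclusive range [low, high] into a lower and an upper portion."""
    if low >= high:
        raise ValueError("a range needs at least two values to be split")
    mid = (low + high) // 2
    return (low, mid), (mid + 1, high)


def narrowed_ranges(field_values: Sequence[int], threshold_count: int) -> List[Range]:
    """Return one range if the cardinality is within the threshold count,
    otherwise the two halves of the value range."""
    low, high = min(field_values), max(field_values)
    if len(field_values) <= threshold_count:
        return [(low, high)]
    return list(split_range(low, high))


# Twelve store IDs against a threshold count of 10: the range is halved.
print(narrowed_ranges(list(range(1, 13)), threshold_count=10))
```

Each returned range could then serve as an additional condition on the filtering field when the smaller aggregate is built.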

In an alternative embodiment, it may be determined if an estimated aggregate size for a filter criterion may exceed a size threshold. If that is the case, the filter criterion may be narrowed down such that a smaller sized aggregate may be built. Such a narrowed down or restricted filtering condition on at least one filtering field may be generated such that the estimated size of the corresponding aggregate may be reduced below the threshold size. The calculation of the estimated aggregate size may be based on a size of result sets of previous queries and a number of valid or allowed values for one or more filtering fields. A valid or allowed value may be a value that fits into the context of the value and may thus be flagged in related meta data accordingly. E.g., a calendar year has only 12 months, thus, a month with a value 14 may not be allowed. The same may apply to store-IDs of a retailer who may only have a number X of stores. Thus, a number greater than the maximum_store_number may not be allowed.
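The description names the inputs of the size estimate but not a formula. The following sketch therefore uses an assumed one: the average result-set size of previous queries multiplied by the number of allowed values of the generalized filtering field. The function name, the formula and the sample numbers are illustrative only.

```python
from typing import Sequence

# Allowed values as they might be flagged in meta data: months 1..12.
ALLOWED_MONTHS = range(1, 13)


def estimate_aggregate_rows(previous_result_sizes: Sequence[int],
                            allowed_value_count: int) -> float:
    """ASSUMED formula: average previous result size times the number of
    allowed values of the filtering field that is being generalized."""
    average = sum(previous_result_sizes) / len(previous_result_sizes)
    return average * allowed_value_count


size_threshold = 30
estimate = estimate_aggregate_rows([2, 4, 3], len(ALLOWED_MONTHS))

# If the estimate exceeds the threshold, restrict the field to fewer values,
# here the first half of the allowed months.
if estimate > size_threshold:
    estimate = estimate_aggregate_rows([2, 4, 3], len(ALLOWED_MONTHS) // 2)

print(estimate, 14 in ALLOWED_MONTHS)
```

With these sample numbers the first estimate lies above the threshold and the narrowed one below it; a month value of 14 is rejected because it is not among the allowed values.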

According to one additional embodiment, the query expressions of queries may be stored. This may be the basis for comparing them later on with previous queries and determining their similarity.

According to a further embodiment, the aggregate may be stored as a database table. Alternatively or in addition, the aggregate may be stored as a materialized query table or a materialized view. This may have the advantage that storage technologies and devices already in use may be re-used.

In a further embodiment, the aggregate may be stored as a data structure in a memory. This may be RAM (random access memory), an SSD (solid state disk) or other storage media of a computer. Furthermore, the aggregate may be stored as a file in a file system using capabilities of an operating system for efficiency.

In several embodiments, the aggregate may relate to a relational database and may be generated as a component of such a relational database, thereby making use of the inherited functions of a modern relational database system.

In other embodiments, the aggregate may be generated as part of a hierarchical database or a network database. Potentially, also data stores for unstructured data, e.g., a CMS (content management system), may be possible.

Furthermore, embodiments may take the form of a computer program product, accessible from a computer-usable or computer-readable medium providing program code for use, by or in connection with a computer or any instruction execution system. For the purpose of this description, a computer-usable or computer-readable medium may be any apparatus that may contain means for storing, communicating, propagating or transporting the program for use, by or in a connection with the instruction execution system, apparatus, or device.

The medium may be an electronic, magnetic, optical, electromagnetic, infrared or a semi-conductor system for a propagation medium. Examples of a computer-readable medium may include a semi-conductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk-read only memory (CD-ROM), compact disk-read/write (CD-R/W), DVD and Blu-Ray-Disk.

It should also be noted that embodiments of the invention have been described with reference to different subject-matters. In particular, some embodiments have been described with reference to method type claims whereas other embodiments have been described with reference to apparatus type claims. However, a person skilled in the art will gather from the above and the following description that, unless otherwise notified, in addition to any combination of features belonging to one type of subject-matter, also any combination between features relating to different subject-matters, e.g., between features of the method type claims, and features of the apparatus type claims, is considered to be disclosed within this document.

The aspects defined above and further aspects of the present invention are apparent from the examples of embodiments to be described hereinafter and are explained with reference to the examples of embodiments, but to which the invention is not limited.

In the following, a detailed description of the figures will be given. All instructions in the figures are schematic. Firstly, block diagrams of embodiments of a method for generating an aggregate are given. Afterwards, further embodiments of an aggregate builder are described.

FIG. 1 shows an embodiment of a method 100 for generating an aggregate. Method 100 for generating an aggregate, in particular in a database in general or in a usual relational database, may be composed of several steps. Upon receiving, 102, a query, a determination may be performed, 104, to evaluate if at least a threshold number of previous first queries comprises a similar filter criterion. For comprehensibility reasons, the received query may be denoted as a second query; it may comprise a filter criterion, which may be a group of conditions, e.g., expressed as an SQL statement. A previous first query may have been executed during a predefined time period before the received query. If the similarity comparison yields a positive answer, the method may comprise generating, 106, an aggregate using the similar filter criterion as an aggregation criterion, such that future queries comprising the similar filter criterion are satisfied by the aggregate.
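The determination of step 104 may be sketched as follows, assuming that earlier queries are kept as (timestamp, filter) pairs and that only the field condition of the similarity test is checked. The names should_build_aggregate, history and window are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Filter = Dict[str, object]            # filtering field -> filtering value
HistoryEntry = Tuple[float, Filter]   # (time of execution, filter criterion)


def fields_compatible(first: Filter, second: Filter) -> bool:
    """Simplified similarity: fields of `first` are a subset of those of `second`."""
    return set(first) <= set(second)


def should_build_aggregate(history: List[HistoryEntry], second: Filter,
                           now: float, window: float, threshold: int) -> bool:
    """Step 104: count previous first queries executed within the predefined
    time period before `now` that carry a similar filter criterion."""
    recent_similar = sum(
        1 for executed_at, first in history
        if now - window <= executed_at <= now and fields_compatible(first, second)
    )
    return recent_similar >= threshold


history = [
    (0.0, {"CALMONTH": "06.2013", "STORE_ID": 5}),
    (50.0, {"CALMONTH": "06.2013", "STORE_ID": 4}),
    (90.0, {"CALMONTH": "06.2013", "STORE_ID": 5}),
]
second_query = {"CALMONTH": "06.2013", "STORE_ID": 8}
print(should_build_aggregate(history, second_query, now=100.0, window=60.0, threshold=2))
```

The first history entry lies outside the time period and is ignored, so two similar previous queries remain.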

FIG. 2 shows a more detailed flowchart of generating an aggregate. A query may be received, 202. It may be tested whether an appropriate aggregate exists, 204. In case of “YES”, the query may be executed against the aggregate, 206. In case of “NO”, a counter may be increased, 208, indicating that no appropriate aggregate exists. The counter may be related to the query being executed. If the numeric value of the counter does not exceed a predefined threshold, 210, the query may be run, 212, against base data. However, if the numeric value of the counter exceeds the predefined threshold, the method may, together with or based on a related system, create, 214, the aggregate relating to the received query. Future queries of the same or a similar type may now be run, 206, against the aggregate, which may execute faster than a query executed against the underlying database.
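The control flow of FIG. 2 may be sketched as a small router. The sketch assumes that each query has already been reduced to a hashable query pattern, and it builds the aggregate when the counter reaches the threshold, which matches the worked example in Table 1. The class and method names are hypothetical.

```python
from collections import Counter
from typing import Hashable, Set


class AggregateRouter:
    """Decides per incoming query whether base data or an aggregate is used."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.miss_counter: Counter = Counter()   # pattern -> queries without aggregate
        self.aggregates: Set[Hashable] = set()   # patterns that have an aggregate

    def route(self, pattern: Hashable) -> str:
        if pattern in self.aggregates:                    # 204: aggregate exists
            return "aggregate"                            # 206
        self.miss_counter[pattern] += 1                   # 208
        if self.miss_counter[pattern] < self.threshold:   # 210
            return "base"                                 # 212
        self.aggregates.add(pattern)                      # 214: create aggregate
        return "aggregate"                                # 206


# Five queries sharing the pattern CALMONTH = 06.2013, STORE_ID = any value.
router = AggregateRouter(threshold=3)
pattern = ("CALMONTH=06.2013", "STORE_ID=*")
decisions = [router.route(pattern) for _ in range(5)]
print(decisions)
```

The first two queries run against base data; from the third query on, the aggregate is used.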

It may be noted that not only exactly the same query may be satisfied by the aggregate, but also queries with similar filter criteria, because the aggregate has been built on an abstraction layer above the native query. Thus, the aggregate is a generalization of the native query.

In order to illustrate the functionality of the aggregate building, the following example may be considered. It queries the sum of revenue from different retail stores having different retail store numbers for the same calendar month:

TABLE 1

| seq | SQL query | count |
|---|---|---|
| 1 | SELECT SUM(REV) FROM BIG_TAB WHERE CALMONTH = 06.2013 AND STORE_ID = 5 | 1 |
| 2 | SELECT SUM(REV) FROM BIG_TAB WHERE CALMONTH = 06.2013 AND STORE_ID = 4 | 2 |
| 3 | SELECT SUM(REV) FROM BIG_TAB WHERE CALMONTH = 06.2013 AND STORE_ID = 5 | 3 |
| | AGGREGATE BUILDING | |
| | rewrite SELECT SUM(REV) FROM AGGREGATE WHERE STORE_ID = 5 | |
| 4 | SELECT SUM(REV) FROM BIG_TAB WHERE CALMONTH = 06.2013 AND STORE_ID = 8 | |
| | → rewrite SELECT SUM(REV) FROM AGGREGATE WHERE STORE_ID = 8 | |
| 5 | SELECT SUM(REV) FROM BIG_TAB WHERE CALMONTH = 06.2013 AND STORE_ID = 4 | |
| | → rewrite SELECT SUM(REV) FROM AGGREGATE WHERE STORE_ID = 4 | |
| ... | | |

“seq” may indicate the sequence of incoming queries, and “count” may indicate the counter for queries having a similar filter criterion.

In this example, the number of similar queries may have reached a given threshold, e.g. “3”. Thus, at sequence 3, an aggregate may be built that may be used for all possible future queries that may comply with the same/similar query pattern. The aggregate may be built by the following query:

SELECT REV, STORE_ID FROM BIG_TAB
  WHERE CALMONTH=06.2013 AND STORE_ID=*

“*” may indicate a wildcard, meaning that any value may be possible. In SQL this is equivalent to removing the condition altogether. The query has been generalized in order to satisfy the similarity condition. All subsequent queries that are supported by the aggregate may be executed against the aggregate. Moreover, the rewrite process may be transparent, meaning invisible, for the caller/user/retrieval process.

To continue with the example: The table BIG_TAB may have the following layout:

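TABLE 2: BIG_TAB

| CALMONTH | STORE_ID | CUSTOMER | REVENUE | . . . |
|---|---|---|---|---|
| 05.2013 | 5 | 002 | 585,43 | . . . |
| 06.2013 | 4 | 001 | 123,43 | . . . |
| 06.2013 | 5 | 002 | 222,22 | . . . |
| 06.2013 | 5 | 003 | 333,22 | . . . |
| 06.2013 | 5 | 004 | 111,11 | . . . |
| 06.2013 | 8 | 003 | 543,38 | . . . |
| 07.2013 | 5 | 002 | 285,87 | . . . |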

An appropriate aggregate may be built by the SQL statement

CREATE AGGREGATE AS (SELECT STORE_ID, SUM(REV) FROM BIG_TAB
  WHERE CALMONTH=06.2013 GROUP BY STORE_ID)

The aggregate may look like this:

TABLE 3: AGGREGATE

| STORE_ID | REVENUE |
|---|---|
| 4 | 123,43 |
| 5 | 666,66 |
| 8 | 543,38 |

Such an aggregate may speed up future queries having similar selection criteria, as explained above.
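The build and rewrite steps of this example can be tried out with an in-memory SQLite database. This is only a sketch under assumptions: the aggregate is stored as an ordinary table created with CREATE TABLE ... AS, CALMONTH is stored as text, and revenue is held in integer cents to avoid rounding effects. The rows are copied from Table 2 as printed; the helper names are hypothetical.

```python
import sqlite3

# (CALMONTH, STORE_ID, CUSTOMER, REV in cents), copied from Table 2.
rows = [
    ("05.2013", 5, "002", 58543),
    ("06.2013", 4, "001", 12343),
    ("06.2013", 5, "002", 22222),
    ("06.2013", 5, "003", 33322),
    ("06.2013", 5, "004", 11111),
    ("06.2013", 8, "003", 54338),
    ("07.2013", 5, "002", 28587),
]

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE BIG_TAB (CALMONTH TEXT, STORE_ID INTEGER, CUSTOMER TEXT, REV INTEGER)"
)
con.executemany("INSERT INTO BIG_TAB VALUES (?, ?, ?, ?)", rows)

# Build the aggregate: the STORE_ID condition is dropped (wildcard) and
# STORE_ID becomes the aggregation criterion.
con.execute(
    "CREATE TABLE AGGREGATE AS "
    "SELECT STORE_ID, SUM(REV) AS REV FROM BIG_TAB "
    "WHERE CALMONTH = '06.2013' GROUP BY STORE_ID"
)


def revenue_from_base(store_id: int) -> int:
    """Original query, executed against the base table."""
    sql = "SELECT SUM(REV) FROM BIG_TAB WHERE CALMONTH = '06.2013' AND STORE_ID = ?"
    return con.execute(sql, (store_id,)).fetchone()[0]


def revenue_from_aggregate(store_id: int) -> int:
    """Rewritten query, executed against the aggregate."""
    sql = "SELECT SUM(REV) FROM AGGREGATE WHERE STORE_ID = ?"
    return con.execute(sql, (store_id,)).fetchone()[0]


for store_id in (4, 5, 8):
    print(store_id, revenue_from_base(store_id), revenue_from_aggregate(store_id))
```

For every store in the selected month, the rewritten query returns the same sum as the original query while reading one aggregate row instead of scanning the base rows.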

If in the example of table 3 it is determined that cardinalities of values of at least one of the filtering fields, i.e., the number of lines of table 3, exceed a threshold count, the related filtering criterion may be narrowed. It may be noted that the filtering fields are here STORE_ID and CALMONTH. If, e.g., the threshold count is predefined as or set to 10 and the number of lines in the result table built for the aggregate, i.e., table 3, is 12, the filter criterion to create the aggregate may be narrowed by dividing the month into two halves, or by building an aggregate for only 50% of all stores. Hence, either the filtering field CALMONTH or STORE_ID may get an additional condition in the related SQL statement. This may reduce the number of lines, i.e., cardinalities of values. Thus, a smaller, more effective aggregate may be built.

FIG. 3 shows a structural depiction of an embodiment of the aggregate builder 300 for generating an aggregate. It may comprise a threshold determination unit 302 adapted for: upon receiving a second query comprising a filter criterion, determining if at least a threshold number of previous first queries comprises a similar filter criterion, wherein the threshold determination unit 302 is further adapted for triggering an aggregate generation unit 304. The aggregate generation unit 304 may be adapted for generating the aggregate using the similar filter criterion as an aggregation criterion. This way, future queries comprising the similar filter criterion may be satisfied by the aggregate.

Embodiments of the invention may be implemented together with virtually any type of computer, regardless of the platform, being suitable for storing and/or executing program code. For example, as shown in FIG. 4, a computing system 400 may include one or more processor(s) 402 with one or more cores per processor, associated memory elements 404, an internal storage device 406 (e.g., a hard disk, an optical drive, such as a compact disk drive or digital video disk (DVD) drive, a flash memory stick, a solid-state disk, etc.), and numerous other elements and functionalities, typical of today's computers (not shown). The memory elements 404 may include a main memory, e.g., a random access memory (RAM), employed during actual execution of the program code, and a cache memory, which may provide temporary storage of at least some program code and/or data in order to reduce the number of times code and/or data is to be retrieved from a long-term storage medium or external bulk storage 416 for an execution. Elements inside the computer 400 may be linked together by means of a bus system 418 with corresponding adapters. Additionally, the aggregate builder 300 may be attached to the bus system 418.

The computing system 400 may also include input means, such as a keyboard 408, a pointing device such as a mouse 410, or a microphone (not shown). Alternatively, the computing system may be equipped with a touch sensitive screen as a main input device. Furthermore, the computer 400 may include output means, such as a monitor or screen 412 [e.g., a liquid crystal display (LCD), a plasma display, a light emitting diode display (LED), or cathode ray tube (CRT) monitor]. The computer system 400 may be connected to a network (e.g., a local area network (LAN), a wide area network (WAN), such as the Internet, or any other similar type of network, including wireless networks) via a network interface connection 414. This may allow a coupling to other computer systems or a storage network or a tape drive. Those skilled in the art will appreciate that many different types of computer systems exist, and the aforementioned input and output means may take other forms. Generally speaking, the computer system 400 may include at least the minimal processing, input and/or output means, necessary to practice embodiments of the invention.

While aspects of the invention have been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this disclosure, will appreciate that other embodiments may be devised, which do not depart from the scope of the invention, as disclosed herein. Accordingly, the scope of the invention should be limited only by the claims. Also, elements described in association with different embodiments may be combined. It should also be noted that reference signs in the claims, if any, should not be construed as limiting elements.

As will be appreciated by one skilled in the art, aspects of the present disclosure may be embodied as a system, method or computer program product. Accordingly, aspects of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.

Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that may contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

Aspects of the present disclosure are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the present disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer readable medium that may direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions, which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions, which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

The block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions discussed hereinabove may occur out of the disclosed order. For example, two functions taught in succession may, in fact, be executed substantially concurrently, or the functions may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams, and combinations of blocks in the block diagrams, may be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

The corresponding structures, materials, acts, and equivalents of all means or steps plus function elements in the claims below, if any, are intended to include any structure, material, or act for performing the function in combination with other claimed elements, as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications, as are suited to the particular use contemplated.

What is claimed is:
1. A method of generating an aggregate, the method comprising: receiving, by one or more processors, a second query for execution subsequent to executing a number of previous first queries; determining, by the one or more processors, if the second query can be executed against an aggregate to obtain data responsive to the second query, wherein the aggregate comprises a prefabricated result summary of a group of query statements; based on determining that the second query cannot be executed against the aggregate, obtaining, by the one or more processors, a value from a counter, wherein the value indicates a number of times the second query has been received by the one or more processors for execution; based on determining that the second query has been received by the one or more processors for execution a predefined number of times, optimizing, by the one or more processors, efficiency related to data accesses in a database by generating a new aggregate, the generating comprising: based on receiving the second query comprising a filter criterion, determining whether at least a threshold number of previous first queries executed during a predefined time period before receiving the second query comprise a similar filter criterion, wherein a first query of the at least the threshold number of previous first queries comprises the similar filter criterion if a first set of filtering fields of the first query is a subset of a second set of filtering fields of the second query and a number of different filtering values between the first query and the second query is below a predefined threshold number of different filtering values; based on determining, responsive to receiving the second query, that the at least the threshold number of previous first queries comprise the similar filter criterion, dynamically generating the new aggregate, on-the-fly, using the similar filter criterion as an aggregation criterion, wherein future queries comprising the similar filter criterion or comprising the filter criterion, are satisfied by executing the future queries against the new aggregate instead of against containers or tables of the database comprising underlying base data of the new aggregate, wherein the new aggregate is generated using filtering fields comprised in aggregate generating filter criterion, wherein the second query is satisfied by being executed against the new aggregate and the second query comprises a native query of the new aggregate, and wherein the new aggregate is generated on an abstraction layer above the native query such that the new aggregate is a generalization of the native query; and satisfying, by the one or more processors, the second query by executing the second query against the new aggregate, wherein the new aggregate provides data responsive to the second query; receiving, by the one or more processors, a new query for execution against the database to access data in the database; determining, by the one or more processors, that the new query comprises the similar filter criterion or the filter criterion; determining, by the one or more processors, if one or more of the containers or the tables of the database comprising the underlying base data of the new aggregate have been updated; based on determining that the underlying base data of the new aggregate have been updated, automatically invalidating, by the one or more processors, the new aggregate and automatically executing the new query against the containers or the tables of the database comprising the underlying base data of the new aggregate; based on determining that the underlying base data of the new aggregate have not been updated, automatically executing, by the one or more processors, the new query against the new aggregate instead of against the containers or the tables of the database comprising the underlying base data of the new aggregate, wherein the new aggregate provides data responsive to the new query, wherein the executing comprises: determining, based on executing the new query against the new aggregate, that a number of occurrences of values corresponding to a filtering field comprised in the filter criterion having generated the new aggregate exceed a threshold count; and generating a smaller aggregate based on narrowing the new aggregate by dividing the value range of the filter field into an upper portion and a lower portion to lower a number of results in the result set of the new query.
 2. The method according to claim 1, further comprising: determining whether an estimated size aggregate for a selected filter criterion exceeds a size threshold; and based on the estimated size aggregate for the selected filter criterion exceeding the size threshold, narrowing the selected filter criterion such that a smaller sized aggregate is built.
 3. The method according to claim 1, wherein query expressions of at least one of the second query and one or more of the first queries are stored.
 4. The method according to claim 1, wherein the new aggregate is stored as a database table.
 5. The method according to claim 1, wherein the new aggregate is a materialized query table.
 6. The method according to claim 1, wherein the new aggregate is stored as a data structure in a memory of a computer.
 7. The method according to claim 1, wherein the new aggregate is stored as a file in a file system.
 8. The method according to claim 1, wherein the new aggregate relates to a relational database.
 9. The method according to claim 1, wherein the new aggregate is generated as part of a hierarchical database or a network database.
 10. An aggregate builder for generating an aggregate, the aggregate builder comprising:
a memory;
one or more processors in communication with the memory; and
program instructions executable by the one or more processors via the memory to perform a method, the method comprising:
receiving, by the one or more processors, a second query for execution subsequent to executing a number of previous first queries;
determining, by the one or more processors, if the second query can be executed against an aggregate to obtain data responsive to the second query, wherein the aggregate comprises a prefabricated result summary of a group of query statements;
based on determining that the second query cannot be executed against the aggregate, obtaining, by the one or more processors, a value from a counter, wherein the value indicates a number of times the second query has been received by the one or more processors for execution;
based on determining that the second query has been received by the one or more processors for execution a predefined number of times, optimizing, by the one or more processors, efficiency related to data accesses in a database by generating a new aggregate, the generating comprising:
based on receiving the second query comprising a filter criterion, determining whether at least a threshold number of previous first queries executed during a predefined time period before receiving the second query comprise a similar filter criterion, wherein a first query of the at least the threshold number of previous first queries comprises the similar filter criterion if a first set of filtering fields of the first query is a subset of a second set of filtering fields of the second query and a number of different filtering values between the first query and the second query is below a predefined threshold number of different filtering values;
based on determining, responsive to receiving the second query, that the at least the threshold number of previous first queries comprise the similar filter criterion, dynamically generating the new aggregate, on-the-fly, using the similar filter criterion as an aggregation criterion, wherein future queries comprising the similar filter criterion or comprising the filter criterion, are satisfied by executing the future queries against the new aggregate instead of against containers or tables of the database comprising underlying base data of the new aggregate, wherein the new aggregate is generated using filtering fields comprised in aggregate generating filter criterion, wherein the second query is satisfied by being executed against the new aggregate and the second query comprises a native query of the new aggregate, and wherein the new aggregate is generated on an abstraction layer above the native query such that the new aggregate is a generalization of the native query; and
satisfying, by the one or more processors, the second query by executing the second query against the new aggregate, wherein the new aggregate provides data responsive to the second query;
receiving, by the one or more processors, a new query for execution against the database to access data in the database;
determining, by the one or more processors, that the new query comprises the similar filter criterion or the filter criterion;
determining, by the one or more processors, if one or more of the containers or the tables of the database comprising the underlying base data of the new aggregate have been updated;
based on determining that the underlying base data of the new aggregate have been updated, automatically invalidating, by the one or more processors, the new aggregate and automatically executing the new query against the containers or the tables of the database comprising the underlying base data of the new aggregate;
based on determining that the underlying base data of the new aggregate have not been updated, automatically executing, by the one or more processors, the new query against the new aggregate instead of against the containers or the tables of the database comprising the underlying base data of the new aggregate, wherein the new aggregate provides data responsive to the new query, wherein the executing comprises:
determining, based on executing the new query against the new aggregate, that a number of occurrences of values corresponding to a filtering field comprised in the filter criterion having generated the new aggregate exceed a threshold count; and
generating a smaller aggregate based on narrowing the new aggregate by dividing the value range of the filter field into an upper portion and a lower portion to lower a number of results in the result set of the new query.
 11. A computer program product for generating an aggregate, the computer program product comprising:
a non-transitory computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising:
receiving, by the one or more processors, a second query for execution subsequent to executing a number of previous first queries;
determining, by the one or more processors, if the second query can be executed against an aggregate to obtain data responsive to the second query, wherein the aggregate comprises a prefabricated result summary of a group of query statements;
based on determining that the second query cannot be executed against the aggregate, obtaining, by the one or more processors, a value from a counter, wherein the value indicates a number of times the second query has been received by the one or more processors for execution;
based on determining that the second query has been received by the one or more processors for execution a predefined number of times, optimizing, by the one or more processors, efficiency related to data accesses in a database by generating a new aggregate, the generating comprising:
based on receiving the second query comprising a filter criterion, determining whether at least a threshold number of previous first queries executed during a predefined time period before receiving the second query comprise a similar filter criterion, wherein a first query of the at least the threshold number of previous first queries comprises the similar filter criterion if a first set of filtering fields of the first query is a subset of a second set of filtering fields of the second query and a number of different filtering values between the first query and the second query is below a predefined threshold number of different filtering values;
based on determining, responsive to receiving the second query, that the at least the threshold number of previous first queries comprise the similar filter criterion, dynamically generating the new aggregate, on-the-fly, using the similar filter criterion as an aggregation criterion, wherein future queries comprising the similar filter criterion or comprising the filter criterion, are satisfied by executing the future queries against the new aggregate instead of against containers or tables of the database comprising underlying base data of the new aggregate, wherein the new aggregate is generated using filtering fields comprised in aggregate generating filter criterion, wherein the second query is satisfied by being executed against the new aggregate and the second query comprises a native query of the new aggregate, and wherein the new aggregate is generated on an abstraction layer above the native query such that the new aggregate is a generalization of the native query; and
satisfying, by the one or more processors, the second query by executing the second query against the new aggregate, wherein the new aggregate provides data responsive to the second query;
receiving, by the one or more processors, a new query for execution against the database to access data in the database;
determining, by the one or more processors, that the new query comprises the similar filter criterion or the filter criterion;
determining, by the one or more processors, if one or more of the containers or the tables of the database comprising the underlying base data of the new aggregate have been updated;
based on determining that the underlying base data of the new aggregate have been updated, automatically invalidating, by the one or more processors, the new aggregate and automatically executing the new query against the containers or the tables of the database comprising the underlying base data of the new aggregate;
based on determining that the underlying base data of the new aggregate have not been updated, automatically executing, by the one or more processors, the new query against the new aggregate instead of against the containers or the tables of the database comprising the underlying base data of the new aggregate, wherein the new aggregate provides data responsive to the new query, wherein the executing comprises:
determining, based on executing the new query against the new aggregate, that a number of occurrences of values corresponding to a filtering field comprised in the filter criterion having generated the new aggregate exceed a threshold count; and
generating a smaller aggregate based on narrowing the new aggregate by dividing the value range of the filter field into an upper portion and a lower portion to lower a number of results in the result set of the new query.
 12. The computer program product according to claim 11, wherein the method further comprises: determining whether an estimated size aggregate for a selected filter criterion exceeds a size threshold; and based on the estimated size aggregate for the selected filter criterion exceeding the size threshold, narrowing the selected filter criterion such that a smaller sized aggregate is built.
 13. The computer program product according to claim 11, wherein query expressions of at least one of the second query and one or more of the first queries are stored.
 14. The computer program product according to claim 11, wherein the new aggregate is stored as a database table.
 15. The computer program product according to claim 11, wherein the new aggregate is a materialized query table.
 16. The computer program product according to claim 11, wherein the new aggregate is stored as a data structure in a memory of a computer, or as a file in a file system.
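
By way of non-limiting illustration only, the similar-filter-criterion test, the threshold check over a predefined time period, and the division of a value range into a lower and an upper portion, as recited in claims 1, 10 and 11, might be sketched in Python as follows. The names Query, is_similar, should_build_aggregate and split_range are hypothetical and do not appear in the claims. Two points are assumptions made for the sketch: the "number of different filtering values" is counted as the number of filtering fields of the first query whose values differ from those of the second query, and the value range is divided at its midpoint.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Query:
    """Hypothetical query record: filtering field -> filtering value."""
    filters: Dict[str, object]
    received_at: float  # arrival time in seconds (assumed representation)


def is_similar(first: Query, second: Query, max_diff_values: int) -> bool:
    """Return True if `first` comprises a filter criterion similar to `second`."""
    # Condition 1: the filtering fields of the first query are a subset
    # of the filtering fields of the second query.
    if not set(first.filters) <= set(second.filters):
        return False
    # Condition 2 (assumed counting rule): the number of fields whose
    # filtering values differ is below the predefined threshold.
    differing = sum(
        1 for field, value in first.filters.items()
        if second.filters[field] != value
    )
    return differing < max_diff_values


def should_build_aggregate(
    second: Query,
    history: List[Query],
    *,
    min_similar: int,
    window: float,
    max_diff_values: int,
) -> bool:
    """True if at least `min_similar` previous queries received within
    `window` seconds before `second` comprise a similar filter criterion."""
    recent = [
        q for q in history
        if 0 <= second.received_at - q.received_at <= window
    ]
    similar = sum(1 for q in recent if is_similar(q, second, max_diff_values))
    return similar >= min_similar


def split_range(
    low: float, high: float
) -> Tuple[Tuple[float, float], Tuple[float, float]]:
    """Divide [low, high] into a lower and an upper portion.
    Splitting at the midpoint is an assumption of this sketch."""
    mid = (low + high) / 2
    return (low, mid), (mid, high)
```

In this sketch, a caller would evaluate should_build_aggregate only after the counter check of the claims has passed, and would build the new aggregate from the filtering fields of the similar filter criterion when the function returns True. The function split_range illustrates one way a smaller aggregate could be derived when the number of occurrences for a filtering field exceeds the threshold count.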